185 research outputs found

    PeakRegressor Identifies Composite Sequence Motifs Responsible for STAT1 Binding Sites and Their Potential rSNPs

    Identifying true transcription factor binding sites on the basis of sequence motif information (e.g., motif pattern, location, combination, etc.) is an important question in bioinformatics. We present “PeakRegressor,” a system that identifies binding motifs by combining DNA-sequence data and ChIP-Seq data. PeakRegressor uses L1-norm log linear regression to predict peak values from binding motif candidates. Our approach successfully predicts the peak values of STAT1 and RNA Polymerase II with correlation coefficients as high as 0.65 and 0.66, respectively. Using PeakRegressor, we could identify composite motifs for STAT1, as well as potential regulatory SNPs (rSNPs) involved in the regulation of transcription levels of neighboring genes. In addition, we show that among five regression methods, L1-norm log linear regression achieves the best performance with respect to binding motif identification, biological interpretability and computational efficiency.
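
The abstract above names L1-norm log linear regression but does not spell out the model. As a rough illustration only, the sketch below assumes it means ordinary linear regression on log-transformed peak values with an L1 (lasso) penalty, fitted by coordinate descent. The function names, the motif-count features, and the synthetic peak heights are all hypothetical and are not taken from PeakRegressor.

```python
import math


def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty (the L1 proximal step)."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def fit_l1_log_linear(features, peaks, penalty, sweeps=100):
    """Minimise (1/2n)*||log(peaks) - b0 - Xw||^2 + penalty*||w||_1.

    Features and log peaks are centred so the intercept stays unpenalised.
    """
    n, p = len(features), len(features[0])
    log_peaks = [math.log(v) for v in peaks]
    y_mean = sum(log_peaks) / n
    x_means = [sum(row[j] for row in features) / n for j in range(p)]
    centred = [[row[j] - x_means[j] for j in range(p)] for row in features]
    residual = [y - y_mean for y in log_peaks]  # residual when all weights are 0
    weights = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            col = [row[j] for row in centred]
            scale = sum(v * v for v in col) / n
            if scale == 0.0:
                continue  # constant feature carries no information
            # Correlate feature j with the residual that excludes its own effect.
            rho = sum(v * (r + weights[j] * v) for v, r in zip(col, residual)) / n
            new_weight = soft_threshold(rho, penalty) / scale
            residual = [r - (new_weight - weights[j]) * v
                        for v, r in zip(col, residual)]
            weights[j] = new_weight
    intercept = y_mean - sum(w * m for w, m in zip(weights, x_means))
    return intercept, weights


# Hypothetical data: counts of three candidate motifs in eight peak regions.
motif_counts = [
    [0, 1, 0], [1, 0, 0], [2, 1, 1], [3, 0, 1],
    [0, 0, 1], [1, 1, 1], [2, 0, 0], [3, 1, 0],
]
# Synthetic peak heights that depend on the first motif only.
peak_heights = [math.exp(1.0 + 0.5 * row[0]) for row in motif_counts]

intercept, weights = fit_l1_log_linear(motif_counts, peak_heights, penalty=0.05)
print(intercept, weights)
```

In this toy setting the penalty shrinks the weight of the informative motif slightly and sets the two uninformative weights to exactly zero, which is the property that makes an L1 penalty attractive for picking motifs out of many candidates.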

    Dissecting complex transcriptional responses using pathway-level scores based on prior information

    Background: The genomewide pattern of changes in mRNA expression measured using DNA microarrays is typically a complex superposition of the response of multiple regulatory pathways to changes in the environment of the cells. The use of prior information, either about the function of the protein encoded by each gene, or about the physical interactions between regulatory factors and the sequences controlling its expression, has emerged as a powerful approach for dissecting complex transcriptional responses. Results: We review two different approaches for combining the noisy expression levels of multiple individual genes into robust pathway-level differential expression scores. The first is based on a comparison between the distribution of expression levels of genes within a predefined gene set and those of all other genes in the genome. The second starts from an estimate of the strength of genomewide regulatory network connectivities based on sequence information or direct measurements of protein-DNA interactions, and uses regression analysis to estimate the activity of gene regulatory pathways. The statistical methods used are explained in detail. Conclusion: By avoiding the thresholding of individual genes, pathway-level analysis of differential expression based on prior information can be considerably more sensitive to subtle changes in gene expression than gene-level analysis. The methods are technically straightforward and yield results that are easily interpretable, both biologically and statistically.

    A statistical framework for integrating two microarray data sets in differential expression analysis

    Background: Different microarray data sets can be collected for studying the same or similar diseases. We expect to achieve a more efficient analysis of differential expression if an efficient statistical method can be developed for integrating different microarray data sets. Although many statistical methods have been proposed for data integration, the genome-wide concordance of different data sets has not been well considered in the analysis. Results: Before considering data integration, it is necessary to evaluate the genome-wide concordance so that misleading results can be avoided. Based on the test results, different subsequent actions are suggested. Both the evaluation of genome-wide concordance and the data integration can be achieved using mixture models based on the normal distribution. Conclusion: The results from our simulation study suggest that misleading results can be generated if the genome-wide concordance issue is not appropriately considered. Our method provides a rigorous parametric solution. The results also show that our method is robust to certain model misspecification and is practically useful for the integrative analysis of differential expression.

    Identification of Direct Target Genes Using Joint Sequence and Expression Likelihood with Application to DAF-16

    A major challenge in the post-genome era is to reconstruct regulatory networks from the biological knowledge accumulated to date. The development of tools for identifying direct target genes of transcription factors (TFs) is critical to this endeavor. Given a set of microarray experiments, a probabilistic model called TRANSMODIS has been developed which can infer the direct targets of a TF by integrating sequence motif, gene expression and ChIP-chip data. The performance of TRANSMODIS was first validated on a set of transcription factor perturbation experiments (TFPEs) involving Pho4p, a well studied TF in Saccharomyces cerevisiae. TRANSMODIS removed elements of arbitrariness in the manual target gene selection process and produced results that concur with one's intuition. TRANSMODIS was further validated on a genome-wide scale by comparing it with two other methods in Saccharomyces cerevisiae. The usefulness of TRANSMODIS was then demonstrated by applying it to the identification of direct targets of DAF-16, a critical TF regulating ageing in Caenorhabditis elegans. We found that 189 genes were tightly regulated by DAF-16. In addition, DAF-16 has differential preference for motifs when acting as an activator or repressor, which awaits experimental verification. TRANSMODIS is computationally efficient and robust, making it a useful probabilistic framework for finding immediate targets.

    Refining the role of laparoscopy and laparoscopic ultrasound in the staging of presumed pancreatic head and ampullary tumours

    Laparoscopy and laparoscopic ultrasound have been validated previously as staging tools for pancreatic cancer. The aim of this study was to determine whether assessment of vascular involvement with abdominal computed tomography (CT) would allow refinement of the selection criteria for laparoscopy and laparoscopic ultrasound (LUS). The details of patients staged with LUS and abdominal CT were obtained from the unit's pancreatic cancer database. A CT grade (O, A-F) of vascular involvement was recorded by a single radiologist. Of 152 patients who underwent LUS, 56 (37%) had unresectable disease. Three of 26 (12%) patients with CT grade O, 27 of 88 (31%) patients with CT grade A to D, 17 of 29 (59%) patients with CT grade E and all nine patients with CT grade F were found to have unresectable disease. In all, 24% of patients with tumours <3 cm were found to have unresectable disease. In those patients with tumours considered unresectable, local vascular involvement was found in 56% of patients and vascular involvement with metastatic disease in 17%, while 20% of patients had liver metastases alone and 5% had isolated peritoneal metastases. The remaining patient was deemed unfit for resection. Selective use of laparoscopic ultrasound is indicated in the staging of periampullary tumours with CT grades A to D.
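
The per-grade counts in the abstract above can be cross-checked against the reported totals and percentages. The short sketch below uses only the numbers stated in the abstract; the variable and function names are illustrative.

```python
# (unresectable, total) per CT grade group, as reported in the abstract.
groups = {
    "O": (3, 26),
    "A-D": (27, 88),
    "E": (17, 29),
    "F": (9, 9),
}


def percent(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


total_patients = sum(total for _, total in groups.values())
total_unresectable = sum(unres for unres, _ in groups.values())
rates = {grade: percent(unres, total) for grade, (unres, total) in groups.items()}

print(total_patients, total_unresectable)            # 152 56
print(percent(total_unresectable, total_patients))   # 37
print(rates)  # {'O': 12, 'A-D': 31, 'E': 59, 'F': 100}
```

The group sizes sum to the 152 patients and 56 unresectable cases reported, and each rounded rate matches the percentage given in the text.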

    c-REDUCE: Incorporating sequence conservation to detect motifs that correlate with expression

    Background: Computational methods for characterizing novel transcription factor binding sites search for sequence patterns or "motifs" that appear repeatedly in genomic regions of interest. Correlation-based motif finding strategies are used to identify motifs that correlate with expression data and do not rely on promoter sequences from a pre-determined set of genes. Results: In this work, we describe a method for predicting motifs that combines the correlation-based strategy with phylogenetic footprinting, where motifs are identified by evaluating orthologous sequence regions from multiple species. Our method, c-REDUCE, can account for variability at a motif position inferred from evolutionary information. c-REDUCE has been tested on ChIP-chip data for yeast transcription factors and on gene expression data in Drosophila. Conclusion: Our results indicate that utilizing sequence conservation information in addition to correlation-based methods improves the identification of known motifs.

    Regularized gene selection in cancer microarray meta-analysis

    Background: In cancer studies, it is common that multiple microarray experiments are conducted to measure the same clinical outcome and expressions of the same set of genes. An important goal of such experiments is to identify a subset of genes that can potentially serve as predictive markers for cancer development and progression. Analyses of individual experiments may lead to unreliable gene selection results because of the small sample sizes. Meta-analysis can be used to pool multiple experiments, increase statistical power, and achieve more reliable gene selection. The meta-analysis of cancer microarray data is challenging because of the high dimensionality of gene expressions and the differences in experimental settings among experiments. Results: We propose a Meta Threshold Gradient Descent Regularization (MTGDR) approach for gene selection in the meta-analysis of cancer microarray data. The MTGDR has many advantages over existing approaches. It allows different experiments to have different experimental settings. It can account for the joint effects of multiple genes on cancer, and it can select the same set of cancer-associated genes across multiple experiments. Simulation studies and analyses of multiple pancreatic and liver cancer experiments demonstrate the superior performance of the MTGDR. Conclusion: The MTGDR provides an effective way of analyzing multiple cancer microarray studies and selecting reliable cancer-associated genes.

    Endocytosis of plasma-derived factor V by megakaryocytes occurs via a clathrin-dependent, specific membrane binding event

    Megakaryocytes were analyzed for their ability to endocytose factor V to define the cellular mechanisms regulating this process. In contrast to fibrinogen, factor V was endocytosed by megakaryocytes derived from CD34+ cells or megakaryocyte-like cell lines, but not by platelets. CD41+ ex vivo-derived megakaryocytes endocytosed factor V, as did subpopulations of the megakaryocyte-like cells MEG-01, and CMK. Similar observations were made for fibrinogen. Phorbol diester-induced megakaryocytic differentiation of the cell lines resulted in a substantial increase in endocytosis of both proteins as compared to untreated cells that did not merely reflect their disparate plasma concentrations. Factor IX, which does not associate with platelets or megakaryocytes, was not endocytosed by any of the cells examined. Endocytosis of factor V by megakaryocytes proceeds through a specific and independent mechanism as CHRF-288 cells endocytosed fibrinogen but not factor V, and the presence of other plasma proteins had no effect on the endocytosis of factor V by MEG-01 cells. Furthermore, as the endocytosis of factor V was also demonstrated to occur through a clathrin-dependent mechanism, these combined data demonstrate that endocytosis of factor V by megakaryocytes occurs via a specific, independent, and most probably receptor-mediated, event. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/75473/1/j.1538-7836.2005.01190.x.pd

    A Bayesian method for calculating real-time quantitative PCR calibration curves using absolute plasmid DNA standards

    Background: In real-time quantitative PCR studies using absolute plasmid DNA standards, a calibration curve is developed to estimate an unknown DNA concentration. However, potential differences in the amplification performance of plasmid DNA compared to genomic DNA standards are often ignored in calibration calculations and in some cases impossible to characterize. A flexible statistical method that can account for uncertainty between plasmid and genomic DNA targets, replicate testing, and experiment-to-experiment variability is needed to estimate calibration curve parameters such as intercept and slope. Here we report the use of a Bayesian approach to generate calibration curves for the enumeration of target DNA from genomic DNA samples using absolute plasmid DNA standards. Results: Instead of the two traditional methods (classical and inverse), a Markov chain Monte Carlo (MCMC) estimation was used to generate single, master, and modified calibration curves. The mean and the percentiles of the posterior distribution were used as point and interval estimates of unknown parameters such as intercepts, slopes and DNA concentrations. The software WinBUGS was used to perform all simulations and to generate the posterior distributions of all the unknown parameters of interest. Conclusion: The Bayesian approach defined in this study allowed for the estimation of DNA concentrations from environmental samples using absolute standard curves generated by real-time qPCR. The approach accounted for uncertainty from multiple sources such as experiment-to-experiment variation, variability between replicate measurements, as well as uncertainty introduced when employing calibration curves generated from absolute plasmid DNA standards.
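
For context, the sketch below shows the traditional classical calibration that the abstract above contrasts with its Bayesian approach: a least-squares line of threshold cycle against log10 of the standard concentration, inverted to estimate an unknown. It is not the paper's MCMC method and yields no interval estimates. The standard concentrations, cycle values, and function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


def estimate_copies(cycle, intercept, slope):
    """Invert the calibration line to get a point estimate of copy number."""
    return 10 ** ((cycle - intercept) / slope)


# Hypothetical plasmid standards: copies per reaction and threshold cycles.
standard_copies = [1e1, 1e2, 1e3, 1e4, 1e5]
standard_cycles = [36.7, 33.4, 30.1, 26.8, 23.5]

log_copies = [math.log10(c) for c in standard_copies]
intercept, slope = fit_line(log_copies, standard_cycles)

# Point estimate for a hypothetical unknown sample.
unknown_copies = estimate_copies(28.45, intercept, slope)
print(intercept, slope, unknown_copies)
```

A Bayesian treatment would replace the single fitted intercept and slope with posterior distributions, so that the estimated concentration carries the uncertainty sources the abstract lists.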